Incremental-Topological-Preserving-Map-Based Fuzzy Q-Learning (ITPM-FQL)

Authors

  • Meng Joo Er
  • Yi Zhou
Abstract

Reinforcement Learning (RL) is considered an appropriate paradigm for acquiring policies for autonomous learning agents that operate without initial knowledge, because RL learns from simple "evaluative" or "critic" information rather than the "instructive" information used in Supervised Learning. There are two well-known types of RL, namely Actor-Critic Learning and Q-Learning. Of the two, Q-Learning (Watkins & Dayan, 1992) is the most widely used learning paradigm because of its simplicity and solid theoretical background. In Q-Learning, Q-vectors are used to evaluate candidate actions, and the action with the highest Q-value is selected. Unfortunately, conventional Q-Learning can only handle discrete states and actions, whereas in the real world a learning agent must deal with continuous states and actions. For instance, in robotic applications the robot needs to respond to dynamically changing environmental states with the smoothest action possible; moreover, the robot's hardware can be damaged by inappropriate discrete actions.

To handle continuous states and actions, many researchers have enhanced the Q-Learning methodology over the years. Continuous Action Q-Learning (CAQL) (Millan et al., 2002) is one Q-Learning methodology that can handle continuous states and actions. Although it improves on conventional Q-Learning, it is not as popular as Fuzzy Q-Learning (FQL) (Jouffe, 1998) because it lacks a solid theoretical background: whereas CAQL generates continuous actions by considering the neighbors of the highest-Q-valued action, FQL uses a theoretically sound Fuzzy Inference System (FIS). Consequently, the FQL approach is more favorable than CAQL, and our proposed approach is therefore based on the FQL technique.

FIS identification can be carried out in two phases, namely the structure identification phase and the parameter identification phase. The structure identification phase defines how fuzzy rules are generated, while the parameter identification phase determines the premise parameters and consequent parts of the fuzzy rules. The FQL approach mainly focuses on handling parameter identification automatically, while structure identification remains an open issue in FQL. To circumvent the issue of structure identification, Dynamic Fuzzy Q-Learning (DFQL) (Er & Deng, 2004) was proposed. The salient feature of the DFQL is that it can generate fuzzy rules according to the ε-completeness and Temporal Difference criteria.
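For readers unfamiliar with the two schemes contrasted above, the following Python sketch illustrates (i) the conventional tabular Q-Learning update with greedy action selection and (ii) an FQL-style continuous action obtained as a firing-strength-weighted combination of per-rule actions. All state and action spaces, learning rates, and membership shapes here are illustrative assumptions, not details taken from the chapter.

# Minimal sketch: tabular Q-Learning update and an FQL-style continuous action.
# Sizes, rates, and fuzzy membership shapes are assumed for illustration.
import numpy as np

# --- Conventional Q-Learning: discrete states and actions -------------------
n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9          # learning rate and discount factor (assumed)

def q_update(s, a, r, s_next):
    """One-step Q-Learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

def greedy_action(s):
    """Select the action with the highest Q-value for state s."""
    return int(np.argmax(Q[s]))

# --- FQL-style continuous action (sketch) ------------------------------------
# Each fuzzy rule keeps a q-vector over a discrete set of candidate actions.
# The global continuous action is the firing-strength-weighted sum of the
# actions chosen by the individual rules.
candidate_actions = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # assumed action set
n_rules = 3
q_rules = np.zeros((n_rules, len(candidate_actions)))

def firing_strengths(x, centers=np.array([-1.0, 0.0, 1.0]), sigma=0.5):
    """Gaussian membership of input x in each rule's premise (assumed shape)."""
    phi = np.exp(-((x - centers) ** 2) / (2 * sigma ** 2))
    return phi / phi.sum()

def fql_action(x):
    """Continuous action: weighted combination of each rule's greedy candidate."""
    phi = firing_strengths(x)
    chosen = candidate_actions[np.argmax(q_rules, axis=1)]
    return float(np.dot(phi, chosen))

The weighted combination is what lets FQL output smooth, continuous actions even though each rule's q-vector is defined over a small discrete candidate set.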


Related Articles

An Online Learning Control Strategy for Hybrid Electric Vehicle Based on Fuzzy Q-Learning

In order to realize the online learning of a hybrid electric vehicle (HEV) control strategy, a fuzzy Q-learning (FQL) method is proposed in this paper. The FQL control strategy consists of two parts: the optimal action-value function Q*(x,u) estimator network (QEN) and the fuzzy parameters tuning (FPT). A back propagation (BP) neural network is applied to estimate Q*(x,u) as the QEN. For the fuzzy co...
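As a rough illustration of the QEN idea mentioned above (a backpropagation network used to estimate Q*(x,u)), the sketch below trains a single-hidden-layer network toward a TD-style target. The layer sizes, activation, inputs, and learning rate are assumptions for illustration and are not taken from the paper.

# Illustrative sketch only: a small backpropagation network as a Q*(x,u) estimator.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 8            # e.g. 2 state features + 1 action input (assumed)
W1 = rng.normal(0, 0.1, (n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, n_hidden)
b2 = 0.0
lr = 0.01

def q_hat(x, u):
    """Forward pass: estimate Q(x,u) with one tanh hidden layer."""
    z = np.tanh(W1 @ np.append(x, u) + b1)
    return W2 @ z + b2, z

def bp_step(x, u, target):
    """One backpropagation step toward a TD target such as r + gamma * max_u' Q(x',u')."""
    global W1, b1, W2, b2
    inp = np.append(x, u)
    q, z = q_hat(x, u)
    err = q - target                     # squared-error gradient factor
    dz = err * W2 * (1 - z ** 2)         # backpropagate through tanh layer
    W1 -= lr * np.outer(dz, inp)
    b1 -= lr * dz
    W2 -= lr * err * z
    b2 -= lr * err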


Anomaly Detection using Fuzzy Q-learning Algorithm

Wireless networks are increasingly overwhelmed by Distributed Denial of Service (DDoS) attacks, which generate flooding packets that exhaust the critical computing and communication resources of a victim's mobile device within a very short period of time; these resources must be protected. Effective detection of DDoS attacks requires an adaptive learning classifier, with less computational complexity, and an ac...


Cooperative game theoretic approach using fuzzy Q-learning for detecting and preventing intrusions in wireless sensor networks

Owing to the distributed nature of denial-of-service attacks, it is tremendously challenging to detect such malicious behavior using traditional intrusion detection systems in Wireless Sensor Networks (WSNs). In the current paper, a game theoretic method is introduced, namely cooperative Game-based Fuzzy Q-learning (G-FQL). G-FQL adopts a combination of both the game theoretic approach and the ...


Fuzzy Sarsa Learning and the Proof of Existence of Its Stationary Points

This paper provides a new Fuzzy Reinforcement Learning (FRL) algorithm based on a critic-only architecture. The proposed algorithm, called Fuzzy Sarsa Learning (FSL), tunes the parameters of the conclusion parts of the Fuzzy Inference System (FIS) online. Our FSL is based on Sarsa, which approximates the Action Value Function (AVF) and is an on-policy method. In each rule, actions are selected accord...
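The sketch below gives one possible (assumed) reading of a Sarsa-style, on-policy update applied to FIS conclusion parameters: each rule keeps its own action values, actions are chosen per rule epsilon-greedily, and the TD error uses the action actually selected at the next step. Membership shapes, epsilon, and step sizes are illustrative only and are not taken from the paper.

# Minimal sketch (assumptions throughout): Sarsa-style tuning of FIS consequents.
import numpy as np

rng = np.random.default_rng(1)
actions = np.array([-1.0, 0.0, 1.0])          # assumed discrete candidate actions
n_rules = 4
q = np.zeros((n_rules, len(actions)))         # per-rule action values (consequents)
alpha, gamma, eps = 0.05, 0.95, 0.1

def memberships(x, centers=np.linspace(-1, 1, 4), sigma=0.4):
    """Normalized Gaussian firing strengths for each rule (assumed shape)."""
    phi = np.exp(-((x - centers) ** 2) / (2 * sigma ** 2))
    return phi / phi.sum()

def select(x):
    """Per-rule epsilon-greedy choice; the FIS output is the weighted action."""
    phi = memberships(x)
    idx = np.where(rng.random(n_rules) < eps,
                   rng.integers(len(actions), size=n_rules),
                   np.argmax(q, axis=1))
    return idx, phi, float(np.dot(phi, actions[idx]))

def sarsa_update(idx, phi, r, idx_next, phi_next):
    """On-policy TD error: uses the action actually selected at the next step."""
    q_sa = np.dot(phi, q[np.arange(n_rules), idx])
    q_next = np.dot(phi_next, q[np.arange(n_rules), idx_next])
    delta = r + gamma * q_next - q_sa
    q[np.arange(n_rules), idx] += alpha * delta * phi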


Genetic reinforcement learning of fuzzy inference system application to mobile robotic

An efficient genetic reinforcement learning algorithm for designing a Fuzzy Inference System (FIS) without any prior knowledge is proposed in this paper. Reinforcement learning using Fuzzy Q-Learning (FQL) is applied to select the consequent action values of a fuzzy inference system. In this method, the consequent value is selected from a predefined value set which is kept unchanged during lear...




Publication year: 2012